Revisiting Batch Norm Initialization
Authors
Abstract
Batch normalization (BN) is comprised of a normalization component followed by an affine transformation and has become essential for training deep neural networks. The standard initialization of each BN in a network sets the affine transformation scale and shift to 1 and 0, respectively. However, after training we have observed that these parameters do not alter much from their initialization. Furthermore, we have noticed that the normalization process can still yield overly large values, which is undesirable for training. We revisit the BN formulation and present a new initialization method and update approach to address the aforementioned issues. Experiments are designed to emphasize and demonstrate the positive influence of proper BN scale initialization on performance, and use rigorous statistical significance tests for evaluation. The approach can be used with existing implementations at no additional computational cost. Source code is available at https://github.com/osu-cvl/revisiting-bn-init.
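The default affine initialization described in the abstract is easy to see in a framework such as PyTorch. The sketch below shows that default (scale 1, shift 0) and, purely as an illustrative assumption, a smaller starting scale; the value 0.1 is not the initialization prescribed by the paper.

```python
import torch
import torch.nn as nn

# Standard initialization: PyTorch's BatchNorm sets the affine scale
# (gamma, stored as .weight) to 1 and the shift (beta, .bias) to 0.
bn = nn.BatchNorm2d(num_features=64)
assert torch.all(bn.weight == 1) and torch.all(bn.bias == 0)

# Hypothetical alternative, for illustration only: start the scale below 1
# so post-BN activations begin with a smaller magnitude. The value 0.1 is
# an assumption, not the initialization proposed in the paper.
nn.init.constant_(bn.weight, 0.1)
nn.init.zeros_(bn.bias)
```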
Similar resources
Revisiting Norm Estimation in Data Streams
We revisit the problem of (1±ε)-approximating the Lp norm, 0 ≤ p ≤ 2, of an n-dimensional vector updated in a stream of length m with positive and negative updates to its coordinates. We give several new upper and lower bounds, some of which are optimal. LOWER BOUNDS. We show that for the interesting range of parameters, Ω(ε^{-2} log(nm)) bits of space are necessary for estimating Lp in one pass for ...
Revisiting Batch Normalization For Practical Domain Adaptation
Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection. However, it is still a common annoyance during the training phase, that one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain. Recent study (Tommasi et al., 2015) shows that a DNN has strong dependenc...
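The snippet above is truncated, but this line of work is associated with adapting BN statistics to a new domain without target labels. The sketch below illustrates that general idea under the assumption of a PyTorch model and an unlabeled target-domain data loader; it is not claimed to match the cited paper's exact procedure.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_bn_statistics(model: nn.Module, target_loader) -> None:
    """Re-estimate BN running statistics on unlabeled target-domain batches."""
    bn_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    for m in model.modules():
        if isinstance(m, bn_types):
            m.reset_running_stats()   # start the estimates from scratch
            m.momentum = None         # None -> cumulative moving average
    model.train()                     # running stats update only in train mode
    for images, _ in target_loader:   # assumes (image, label) batches
        model(images)                 # forward passes alone refresh the stats
    model.eval()
```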
Adjusting for Dropout Variance in Batch Normalization and Weight Initialization
We show how to adjust for the variance introduced by dropout with corrections to weight initialization and Batch Normalization, yielding higher accuracy. Though dropout can preserve the expected input to a neuron between train and test, the variance of the input differs. We thus propose a new weight initialization by correcting for the influence of dropout rates and an arbitrary nonlinearity’s ...
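As a rough illustration of compensating for dropout-induced variance at initialization, the sketch below scales a He-style standard deviation by the square root of the keep probability; the correction in the cited work also accounts for the nonlinearity, so treat this as an assumption-laden approximation rather than the paper's exact formula.

```python
import math
import torch.nn as nn

def he_init_with_dropout(linear: nn.Linear, keep_prob: float) -> None:
    # Inverted dropout divides activations by keep_prob, inflating their
    # variance by roughly 1/keep_prob; shrinking the weight std by
    # sqrt(keep_prob) compensates for that inflation.
    fan_in = linear.in_features
    std = math.sqrt(2.0 / fan_in) * math.sqrt(keep_prob)
    nn.init.normal_(linear.weight, mean=0.0, std=std)
    nn.init.zeros_(linear.bias)

layer = nn.Linear(512, 256)
he_init_with_dropout(layer, keep_prob=0.8)
```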
Revisiting the Problem of Weight Initialization for Multi-Layer Perceptrons Trained with Back Propagation
One of the main reasons for the slow convergence and the suboptimal generalization results of MLP (Multilayer Perceptrons) based on gradient descent training is the lack of a proper initialization of the weights to be adjusted. Even sophisticated learning procedures are not able to compensate for bad initial values of weights, while good initial guess leads to fast convergence and or better gen...
L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks
Batch Normalization (BN) has been proven to be quite effective at accelerating and improving the training of deep neural networks (DNNs). However, BN brings additional computation, consumes more memory and generally slows down the training process by a large margin, which aggravates the training effort. Furthermore, the nonlinear square and root operations in BN also impede the low bit-width qu...
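To make the idea of an L1-based normalization concrete, here is a minimal sketch that divides by the per-channel mean absolute deviation instead of the standard deviation, avoiding square and square-root operations; the scaling constant, affine parameters, and training details of the cited work are omitted.

```python
import torch

def l1_batch_norm(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Normalize an (N, C, H, W) batch per channel using the mean absolute
    # deviation as the scale statistic. This is only a sketch of the core idea.
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    centered = x - mean
    mad = centered.abs().mean(dim=(0, 2, 3), keepdim=True)
    return centered / (mad + eps)

x = torch.randn(8, 16, 32, 32)
y = l1_batch_norm(x)
```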
Journal
Journal title: Lecture Notes in Computer Science
Year: 2022
ISSN: ['1611-3349', '0302-9743']
DOI: https://doi.org/10.1007/978-3-031-19803-8_13